Comparison of approaches for instrumentally predicting the quality of text-to-speech systems
نویسندگان
چکیده
In this paper, we compare and combine different approaches for instrumentally predicting the perceived quality of Text-to-Speech systems. First, a log-likelihood is determined by comparing features extracted from the synthesized speech signal with features trained on natural speech. Second, parameters are extracted which capture quality-relevant degradations of the synthesized speech signal. Both approaches are combined and evaluated on three auditory test databases. The results show that auditory quality judgments can in many cases be predicted with a sufficiently high accuracy and reliability, but that there are considerable differences, mainly between male and female speech samples.
منابع مشابه
Comparison of Approaches for Instrumentally Predicting the Quality of Text-to-Speech Systems: Data from Blizzard Challenges
In this paper, we compare and combine different approaches for instrumentally predicting the perceived quality of Text-toSpeech systems. First, a Log-Likelihood is determined by comparing features extracted from synthesized speech signals with features trained on natural speech. Second, parameters are extracted which capture quality-relevant degradations of the synthesized speech signal. Both a...
متن کاملStudy on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...
متن کاملمراحل و نحوه ی تهیه ی دادگان های صوتی هجایی و دایفونی برای سامانه ی تبدیل متن به گفتار فارسی
Abstract Speech databases are part of the concatenative text to speech synthesis systems. Phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two syllable and diphone speech databases for Persian and investigates the way of their development and their specifications and their advantages to each other. ...
متن کاملCipher text only attack on speech time scrambling systems using correction of audio spectrogram
Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...
متن کاملQualitative and Quantitative Examination of Text Type Readabilities: A Comparative Analysis
This study compared 2 main approaches to readability assessment. Thequantitative approach applied idea density based on part of speech tagging andcompared 3 sets of text types (i.e., narrative, expository, and argumentative) withrespect to their ease of reading. The qualitative approach was done throughdeveloping questionnaires measuring intermediate EFL learners’ perceptions oncontent, motivat...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010